Goto

Collaborating Authors

 data observation


Free energy score space

Neural Information Processing Systems

Score functions induced by generative models extract fixed-dimension feature vectors from different-length data observations by subsuming the process of data generation, projecting them in highly informative spaces called score spaces. In this way, standard discriminative classifiers are proved to achieve higher performances than a solely generative or discriminative approach. In this paper, we present a novel score space that exploits the free energy associated to a generative model through a score function. This function aims at capturing both the uncertainty of the model learning and local compliance of data observations with respect to the generative process. Theoretical justifications and convincing comparative classification results on various generative models prove the goodness of the proposed strategy.


Optimal Learning

arXiv.org Artificial Intelligence

This paper studies the problem of learning an unknown function $f$ from given data about $f$. The learning problem is to give an approximation $\hat f$ to $f$ that predicts the values of $f$ away from the data. There are numerous settings for this learning problem depending on (i) what additional information we have about $f$ (known as a model class assumption), (ii) how we measure the accuracy of how well $\hat f$ predicts $f$, (iii) what is known about the data and data sites, (iv) whether the data observations are polluted by noise. A mathematical description of the optimal performance possible (the smallest possible error of recovery) is known in the presence of a model class assumption. Under standard model class assumptions, it is shown in this paper that a near optimal $\hat f$ can be found by solving a certain discrete over-parameterized optimization problem with a penalty term. Here, near optimal means that the error is bounded by a fixed constant times the optimal error. This explains the advantage of over-parameterization which is commonly used in modern machine learning. The main results of this paper prove that over-parameterized learning with an appropriate loss function gives a near optimal approximation $\hat f$ of the function $f$ from which the data is collected. Quantitative bounds are given for how much over-parameterization needs to be employed and how the penalization needs to be scaled in order to guarantee a near optimal recovery of $f$. An extension of these results to the case where the data is polluted by additive deterministic noise is also given.


Hierarchical Clustering in Machine Learning - Analytics Vidhya

#artificialintelligence

This article was published as a part of the Data Science Blogathon. Hierarchical clustering is one of the most famous clustering techniques used in unsupervised machine learning. K-means and hierarchical clustering are the two most popular and effective clustering algorithms. The working mechanism they apply in the backend allows them to provide such a high level of performance. In this article, we will discuss hierarchical clustering and its types, its working mechanisms, its core intuition, the pros and cons of using this clustering strategy and conclude with some fundamentals to remember for this practice.


Free energy score space

Neural Information Processing Systems

Score functions induced by generative models extract fixed-dimension feature vectors from different-length data observations by subsuming the process of data generation, projecting them in highly informative spaces called score spaces. In this way, standard discriminative classifiers are proved to achieve higher performances than a solely generative or discriminative approach. In this paper, we present a novel score space that exploits the free energy associated to a generative model through a score function. This function aims at capturing both the uncertainty of the model learning and local compliance of data observations with respect to the generative process. Theoretical justifications and convincing comparative classification results on various generative models prove the goodness of the proposed strategy.


GP-Localize: Persistent Mobile Robot Localization using Online Sparse Gaussian Process Observation Model

arXiv.org Machine Learning

Central to robot exploration and mapping is the task of persistent localization in environmental fields characterized by spatially correlated measurements. This paper presents a Gaussian process localization (GP-Localize) algorithm that, in contrast to existing works, can exploit the spatially correlated field measurements taken during a robot's exploration (instead of relying on prior training data) for efficiently and scalably learning the GP observation model online through our proposed novel online sparse GP. As a result, GP-Localize is capable of achieving constant time and memory (i.e., independent of the size of the data) per filtering step, which demonstrates the practical feasibility of using GPs for persistent robot localization and autonomy. Empirical evaluation via simulated experiments with real-world datasets and a real robot experiment shows that GP-Localize outperforms existing GP localization algorithms.